

Search results: All records where Creators/Authors contains "Shah, Ankit"

Note: Clicking a Digital Object Identifier (DOI) link takes you to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the publisher's embargo period.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available May 15, 2026
  2. Free, publicly-accessible full text available April 24, 2026
  3. Instruction tuning is critical for adapting large language models (LLMs) to downstream tasks, and recent studies have demonstrated that small amounts of human-curated data can outperform larger datasets, challenging traditional data scaling laws. While LLM-based data quality rating systems offer a cost-effective alternative to human annotation, they often suffer from inaccuracies and biases, even in powerful models like GPT-4. In this work, we introduce DS2, a Diversity-aware Score curation method for Data Selection. By systematically modeling error patterns through a score transition matrix, DS2 corrects LLM-based scores and promotes diversity in the selected data samples. Our approach shows that a curated subset (just 3.3% of the original dataset) outperforms full-scale datasets (300k samples) across various machine-alignment benchmarks, and matches or surpasses human-aligned datasets such as LIMA with the same sample size (1k samples). These findings challenge conventional data scaling assumptions, highlighting that redundant, low-quality samples can degrade performance and reaffirming that "more can be less." 
    Free, publicly-accessible full text available April 24, 2026
  4. Free, publicly-accessible full text available April 24, 2026
  5. Free, publicly-accessible full text available April 24, 2026
  6. Free, publicly-accessible full text available January 1, 2026
  7. Free, publicly-accessible full text available January 1, 2026
  8. Cybersecurity operations centers (CSOCs) protect organizations by monitoring network traffic and detecting suspicious activities in the form of alerts. The security response team within CSOCs is responsible for investigating and mitigating alerts. However, an imbalance between alert volume and available analysts creates a backlog, putting the network at risk of exploitation. Recent research has focused on improving the alert-management process by triaging alerts, optimizing analyst scheduling, and reducing analyst workload through systematic discarding of alerts. However, these works overlook the delays caused in alert investigations by several factors, including: (i) false or benign alerts contributing to the backlog; (ii) analysts experiencing cognitive burden from repeatedly reviewing unrelated alerts; and (iii) analysts being assigned alerts that do not match their expertise. We propose a novel framework that considers these factors and uses machine learning and mathematical optimization methods to dynamically improve throughput during work shifts. The framework achieves efficiency by automating the identification and removal of a portion of benign alerts, forming clusters of similar alerts, and assigning analysts to alerts with matching attributes. Experiments conducted using real-world CSOC data demonstrate a 60.16% reduction in the alert backlog for an 8-hour work shift compared to the currently employed approach.
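The diversity-promoting selection described in the DS2 abstract (item 3) can be illustrated with a small greedy sketch. This is only an illustrative MMR-style heuristic under assumed names (`select_diverse`, trade-off weight `lam`, Euclidean feature distance), not the paper's actual DS2 procedure, which additionally corrects LLM-assigned scores through a score transition matrix before selection:

```python
def select_diverse(scores, features, k, lam=0.7):
    """Greedily pick k samples, trading off each sample's quality score
    against its distance to the already-selected set, so that redundant
    near-duplicates are penalized (an MMR-style illustrative heuristic)."""
    def dist(a, b):
        # Euclidean distance between two feature vectors
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = []
    remaining = list(range(len(scores)))
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda i: lam * scores[i]
            + (1 - lam)
            * min((dist(features[i], features[j]) for j in selected), default=0.0),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

With this trade-off, a low-scoring but distinctive sample can be chosen over a high-scoring near-duplicate of an already-selected one, which is the "redundant samples degrade performance" intuition the abstract highlights.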
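The expertise-matched assignment step in the CSOC abstract (item 8) can be sketched as a simple greedy matching. The function name `assign_alerts` and the expertise-score dictionary are hypothetical stand-ins; the paper formulates this as a mathematical optimization model rather than the greedy pass shown here:

```python
def assign_alerts(analysts, alerts, expertise):
    """Greedily match each analyst to at most one alert, highest
    expertise score first. `expertise[(analyst, alert_type)]` is an
    assumed match score in [0, 1]; missing pairs default to 0."""
    # Enumerate all (score, analyst, alert index) pairs, best first.
    pairs = sorted(
        ((expertise.get((a, t), 0.0), a, i)
         for a in analysts
         for i, t in enumerate(alerts)),
        reverse=True,
    )
    taken_analysts, taken_alerts, assignment = set(), set(), {}
    for score, a, i in pairs:
        if a not in taken_analysts and i not in taken_alerts:
            assignment[i] = a  # alert i handled by analyst a
            taken_analysts.add(a)
            taken_alerts.add(i)
    return assignment
```

A greedy pass like this is not guaranteed optimal; an assignment-problem solver (e.g. the Hungarian algorithm) would maximize the total match score exactly, which is closer in spirit to the optimization approach the abstract describes.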